\subsection{Multicore Support}\label{subsec:mp-support}
The Muen separation kernel makes use of all logical processors available in a
system. The processor count of a specific hardware platform is declared in the
system policy, see section \ref{subsec:hardware}. This section describes how
the multicore setup is done during kernel startup.

Modern PC systems comply with the Intel MultiProcessor (MP\index{MP})
specification. In short, the Intel MP specification is an open-standard
describing enhancements to both operating systems and firmware to be able to
initialize, boot and operate x86 multi-processor systems. For more information
see \cite{intel:mp}.

After the hardware completes its part of the MP specification, one processor
has been negotiated to be the bootstrap processor (BSP\index{BSP}). All other
logical processors, called application processors (AP\index{AP}), halt until
they receive a specific inter-processor interrupt (IPI\index{IPI}) sequence.

\begin{figure}[h]
	\centering
	\input{graph_mp_overview}
	\caption{Multicore architecture}
	\label{fig:mp-overview}
\end{figure}

The BSP starts executing code as described in section \ref{subsec:init}. The
init code initializes the system and jumps into the main SPARK kernel. The
kernel running on the BSP is responsible for bootstraping the application
processors. It first enables its local APIC\index{APIC} to be able to send
inter-processor interrupts to the waiting AP processors. In order to wake them
up, the INIT-SIPI-SIPI\footnote{SIPI is short for Startup
IPI}\index{INIT}\index{SIPI}\index{IPI} IPI sequence must be sent to their
APICs as dictated by the MP specification. See also figure
\ref{fig:mp-overview} for an illustration of this process. The SIPI IPI
contains the physical address vector 0x0 of the trampoline code copied to
low-memory by the init code. The AP processors jump to this code after wakeup.
The trampoline performs the following steps:

\begin{enumerate}
	\item Set up 32-bit GDT\index{GDT}
	\item Switch CPU to protected mode
	\item Initialize DS and SS segments
	\item Jump to the AP entry point in the init code
\end{enumerate}

These steps initialize the APs to the same architectural state as the
bootloader does for the BSP: 32-bit protected mode with paging disabled.
Therefore, the final step is to let the AP processors jump to the common init
code described in section \ref{subsec:init}, starting by enabling PAE mode
(task 4).

\subsubsection{Per-Kernel Memory}
The Muen kernels operate fully symmetrical, i.e. code running on the different
logical processors is (binary) identical. Nevertheless, each kernel owns a
distinct stack page and also a page to store per-CPU data. This however is fully
transparent to the kernels as their virtual stack and global storage address
values are the same. This is achieved by using different page table structures
for each kernel. Page tables are created by the policy tool and setup on system
startup by the init code. The main kernel has no access to these structures in
memory and does not concern itself with memory management.

\subsubsection{Synchronization}
Since synchronization is error-prone and it is desirable to reduce inter-core
dependencies as much as possible, the Muen kernel tries to avoid locks and
other synchronization primitives. Nevertheless some form of synchronization is
required at certain key points in the code. This section describes the spinlock
and barrier mechanisms used by the kernel.

\paragraph{Spinlock}
The spinlock implementation uses the \texttt{XCHG} processor instruction to
atomically swap the value one with the contents of a lock variable in memory.
If the result of the set operation is zero, no other core currently holds the
lock and it is successfully acquired. If the result is one, the lock is
currently busy and the core must spin and retry again.

The spinlock is implemented as recommended by Intel SDM, volume 3A, section
8.10.6.1. Inside the lock's busy loop the \texttt{PAUSE} instruction is used to
improve performance and resource utilization (see Intel SDM \cite{IntelSDM},
volume 1, section 11.4.4.4).

\paragraph{Barrier}
As described in section \ref{subsec:design-scheduling}, to guarantee temporal
separation, the scheduling plans on the different logical processors must be
synchronized on major frame transition. This is achieved by a SPARK
implementation of a sense-reversing barrier as described in the book \emph{The
Art of Multiprocessor Programming} by Maurice Herlihy and Nir Shavit
\cite{Herlihy:2008:AMP:1734069}. This kind of barrier is safe for reuse and free
from issues of CPUs overtaking each other.
